When Should You Adjust Standard Errors for Clustering?∗
نویسندگان
چکیده
In empirical work in economics it is common to report standard errors that account for clustering of units. Typically, the motivation given for the clustering adjustments is that unobserved components in outcomes for units within clusters are correlated. However, because correlation may occur across more than one dimension, this motivation makes it difficult to justify why researchers use clustering in some dimensions, such as geographic, but not others, such as age cohorts or gender. This motivation also makes it difficult to explain why one should not cluster with data from a randomized experiment. In this paper, we argue that clustering is in essence a design problem, either a sampling design or an experimental design issue. It is a sampling design issue if sampling follows a two stage process where in the first stage, a subset of clusters were sampled randomly from a population of clusters, and in the second stage, units were sampled randomly from the sampled clusters. In this case the clustering adjustment is justified by the fact that there are clusters in the population that we do not see in the sample. Clustering is an experimental design issue if the assignment is correlated within the clusters. We take the view that this second perspective best fits the typical setting in economics where clustering adjustments are used. This perspective allows us to shed new light on three questions: (i) when should one adjust the standard errors for clustering, (ii) when is the conventional adjustment for clustering appropriate, and (iii) when does the conventional adjustment of the standard errors matter. JEL Classification: C14, C21, C52
منابع مشابه
A Practitioner's Guide to Cluster-Robust Inference
We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village or region or other category such as industry, and state-year di erences-in-di erences studies with clustering on state. In such settings default standard erro...
متن کاملWhich, When, and How: Hierarchical Clustering with Human-Machine Cooperation
Human–Machine Cooperations (HMCs) can balance the advantages and disadvantages of human computation (accurate but costly) and machine computation (cheap but inaccurate). This paper studies HMCs in agglomerative hierarchical clusterings, where the machine can ask the human some questions. The human will return the answers to the machine, and the machine will use these answers to correct errors i...
متن کاملP35: How to Manage Anxiety
Anxiety is a mental state that is elicited in anticipation of threat or potential threat. Sensations of anxiety are a normal part of human experience, but excessive or inappropriate anxiety can become an illness. Anxiety is part of the normal human experience. We may speculate that it served human survival during evolution by enhancing preparedness and alertness. However, anxious manifestations...
متن کاملPresenting a Hybrid Approach based on Two-stage Data Envelopment Analysis to Evaluating Organization Productivity
Measuring the performance of a production system has been an important task in management for purposes of control, planning, etc. Lord Kelvin said :“When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” Hence, manag...
متن کاملDetermination of the Best Hierarchical Clustering Method for Regional Analysis of Base Flow Index in Kerman Province Catchments
The lack of complete coverage of hydrological data forces hydrologists to use the homogenization methods in regional analysis. In this research, in order to choose the best Hierarchical clustering method for regional analysis, base flow and related index were extracted from daily stream flow data using two parameter recursive digital filters in 43 hydrometric stations of the Kerman province. Ph...
متن کامل